An Efficient Dimensionality Reduction Approach for Small-sample Size and High-dimensional Data Modeling
Authors
Abstract
As massive multidimensional data are generated in a wide range of emerging applications, this paper introduces two new dimensionality reduction methods for processing and modeling small-sample-size, high-dimensional data. For feature selection, the support vector machine (SVM) is combined with recursive feature elimination (RFE) to give the SVM-RFE algorithm. For feature extraction, higher-order singular value decomposition (HOSVD) is added, which requires organizing the data into a high-order tensor. Validation on simulation experiment data shows that the proposed feature selection and feature extraction methods can be applied effectively to analyzing and modeling atmospheric corrosion data. The feature selection method is intended to ensure that the remaining feature subset is optimal; the feature extraction method preserves the original structure, discriminative information, and integrity of the data. Finally, this paper proposes a complete dimensionality reduction solution for the high-dimensional, small-sample data problem, and the solution has been implemented in code.
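The abstract does not show the authors' implementation, so the following is only a minimal sketch of the two named techniques, not the paper's code. It assumes NumPy is available. All function names (`train_linear_svm`, `svm_rfe`, `hosvd`, and the helpers) are hypothetical. The linear SVM is trained with a simple batch subgradient descent as a stand-in for a real solver, features are ranked by squared weight (a commonly used SVM-RFE criterion), and the HOSVD is the standard truncated form built from SVDs of the mode unfoldings. How the paper chains the two steps is not stated in the abstract and is not assumed here.

```python
import numpy as np


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent.

    X: (n_samples, n_features) array; y: labels in {-1.0, +1.0}.
    The hyperparameters are illustrative assumptions, not tuned values.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # samples that violate the margin
        grad_w = lam * w - X[active].T @ y[active] / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep):
    """Recursively drop the feature with the smallest squared SVM weight."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        w, _ = train_linear_svm(X[:, remaining], y)
        worst = int(np.argmin(w ** 2))  # position within `remaining`
        del remaining[worst]
    return remaining  # original column indices of the surviving features


def unfold(T, mode):
    """Mode-n unfolding: rows index the chosen mode, columns all other modes."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    out = np.tensordot(M, T, axes=(1, mode))  # contracted axis becomes axis 0
    return np.moveaxis(out, 0, mode)


def hosvd(T, ranks):
    """Truncated HOSVD: one factor matrix per mode plus a core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])  # leading r left singular vectors
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)  # project onto each subspace
    return core, factors


def reconstruct(core, factors):
    """Map a core tensor back to the original space."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_product(T, U, mode)
    return T
```

In this sketch, `svm_rfe(X, y, n_keep)` returns the column indices that survive elimination, and `hosvd(T, ranks)` returns a core tensor of shape `ranks` that can serve as the extracted low-dimensional representation. With `ranks` equal to the full tensor shape, `reconstruct` recovers the input up to floating-point error.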
Related resources
A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters
Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...
Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
Kernel-based Fuzzy Feature Extraction Method and Its Application to Face Image Classification
The Hughes phenomenon (or the curse of dimensionality) shows two essential directions for improving the classification performance on high-dimensional and small sample size (SSS) problems. One is to reduce the dimensionality of applied data by feature extraction or feature selection methods. The other is to increase the training sample size. In recent years some kernel-based feature extraction ...
A Framework for Local Supervised Dimensionality Reduction of High Dimensional Data
High dimensional data presents a challenge to the classification problem because of the difficulty in modeling the precise relationship between the large number of feature variables and the class variable. In such cases, it may be desirable to reduce the information to a small number of dimensions in order to improve the accuracy and effectiveness of the classification process. While data reduc...
Journal title:
- JCP
Volume 9, Issue
Pages -
Publication date: 2014